GLUBODIES - MULTIPLICITIES OF PROTEINS CAPABLE OF BINDING A
VARIETY OF SMALL MOLECULES

Technical Field

The invention relates generally to proteins that are capable of binding small 
molecules; and, more particularly, to the creation of families of proteins that result
from randomization or other alteration of solvent-accessible loops substantially 
irrelevant to the remainder of the protein to confer on the family a range of binding 
affinities for small molecular targets. 



Background 

Natural selection in biological systems has resulted in the evolution of a number 
of macromolecules having the capacity to bind small molecular targets. Such
macromolecules can be referred to as "ligates" in recognition of their ability to bind to
cognate "ligands". Many naturally selected ligates, particularly protein ligates, exhibit
very specific, high affinity binding with their cognate ligand. Typical examples include 
hormone receptors and their corresponding hormones; and enzymes and their 
corresponding substrates. 

Such naturally occurring ligates, and/or their cognate ligands and analogs
thereof, can be employed in a broad variety of applications, including analytical,
diagnostic and therapeutic applications. However, the ligates available in nature for a
particular purpose may have inappropriate specificities, be too costly to manufacture,
or have other physical properties that make them undesirable. Therefore,
additional sources of ligates, besides those that nature provides, would be desirable.

Various approaches to obtaining large families of additional potential ligates 
have been reported. Kauffman (International application PCT/CH85/00099) describes
the generation of large numbers of proteins using random DNA sequences for
recombinant production of these potential ligates. Ladner, US patent 5,223,409,
describes coupling of such variants to a genetically amplifiable unit, such as
bacteriophage coat protein. Various alterations in antibodies to create subfamilies
have also been attempted. The identification of oligonucleotides appropriate for use as
ligates is also described in PCT application WO94/08050 using selection procedures
from large random mixtures of nucleic acids.

All of these approaches suffer from the exponential growth in the number of
possible variants as the number of monomers in the mixture of polymers increases,
i.e., a "combinatorial explosion."
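
The scale of this growth can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that every varied position may hold any of the 20 natural amino acids; the function name sequence_space is hypothetical and is not taken from the cited references.

```python
def sequence_space(n_positions: int, alphabet_size: int = 20) -> int:
    """Number of distinct polymers when each of n_positions is varied
    independently over alphabet_size monomers (assumed: 20 amino acids)."""
    if n_positions < 0 or alphabet_size < 1:
        raise ValueError("n_positions must be >= 0 and alphabet_size >= 1")
    return alphabet_size ** n_positions


if __name__ == "__main__":
    # Library size grows by a factor of 20 for every added position.
    for n in (3, 6, 12, 20):
        print(f"{n:2d} randomized positions -> {sequence_space(n):.3e} variants")
```

Even a modest segment of twelve fully randomized positions exceeds what can be sampled exhaustively in practice, which is why reduced or systematically chosen monomer sets are of interest.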

Phage display libraries containing random mutations at positions 107, 108, 110,
111, 208, 213, 216, 219, 220 and 222 of GST A1-1 were prepared by Widersten, M.
et al. J Mol Biol (1995) 250:115-122. Random mutations at positions 9-14, 102-112
and 210-220 of GST 2-2 were reported by Gulick, A. et al. Proc Natl Acad Sci USA
(1995) 92:8140-8144.

One approach to overcoming this combinatorial explosion has been described
by one of the present inventors in U.S. patents 5,133,866 and 5,340,474. Systematic
variation of the monomers results in a representative family using smaller numbers of
polymers. Others have approached this problem by systematically varying, for
example, one residue at a time. Huang, X. et al. Nature Struct Biol (1994) 1:226-230
describe the preparation of a random library of myoglobin mutants prepared by using a
single-base misincorporation random mutagenesis method. Palzkill, T. et al. in
Proteins: Structure, Function, and Genetics (1992) 14:29-44 describe a mutagenesis
technique which randomizes the nucleotide sequence in a 3-6 codon region of a gene
and then determines the percentage of random sequences that produce functional
protein, where a low percentage of functionality indicates that the mutagenized region is
important for the structure and/or function of the protein.

Chimeric forms of glutathione transferases (GSTs) have also been prepared.
Björnestedt, R. et al. Biochem J (1992) 282:505-510 describe a human/rat chimera
composed of a human alpha subunit I from the N-terminus to His143 or to Pro207
followed by the complementary C-terminal portion of rat alpha-I subunit. In addition,
there have been studies to elucidate the function of particular residues in the isoenzyme
GST P1-1. See Ricci, G. et al. J Biol Chem (1995) 270:1243-1248; Lo Bello, M. et al.
ibid:1249-1253.

Nature's approach to generation of binding agents to a wide range of target
small molecules is reflected in the generation of antibodies against virtually any target.
All vertebrates have large genomic loci for generation of the immunoglobulin
repertoire wherein the variable regions of antibodies are assembled in response to
stimulation by an antigen by rearrangement of individual portions of these loci to result
in suitable binding characteristics. It is known that the regions specifically responsible
for antigen binding, the complementarity determining regions (CDRs), are supported on
a scaffolding of framework regions (FRs) and held in juxtaposition appropriate for
antigen recognition. Immunoglobulins appear to be the only "family" of proteins that is
known to have been created naturally to bind such a multiplicity of targets.

The present invention provides families of protein ligates, termed "glubodies",
that are capable of binding a variety of small molecular ligands. Such families will be
useful as sources of new ligates for the above-described applications, including
analytical and diagnostic applications.



Disclosure of the Invention 
The present invention provides families of potential ligates that are capable of
binding a wide variety of small molecules. The families of the invention are obtained
by taking advantage of a "loop" structure in a native protein and providing alterations 
in the loop to confer differing binding characteristics depending on the nature of the 
alterations. 

The forms of the naturally occurring proteins that contain modified loops can
be designated "glubodies", since they are, in a sense, analogs of antibodies which are
capable of binding small molecules or what would correspond to a hapten. Since the
glubodies are modified forms of naturally occurring proteins, these naturally occurring
proteins can be called "protoglubodies". There are two regions of significance in the
protoglubody: a "protogludomain", which is the loop region, and the "framework"
region. In the glubody, only the protogludomain has been modified.

Thus, in one aspect, the invention is directed to a method to prepare a
multiplicity of member protein ligates (which collectively bind to or react with a variety
of ligands), which method comprises identifying a protogludomain in a protoglubody
protein and altering the protogludomain of each member of the family or multiplicity of
protoglubody protein molecules. The alteration is different for each member. The
alterations in the protogludomain comprise substitution of one amino acid for another
and/or deletion of one or more amino acids and/or insertion of one or more amino
acids. Such alterations also include the randomized substitution of a segment of amino
acids that are contiguous in the protogludomain, wherein said segment of amino acids
comprises at least about three amino acids and fewer than about twenty amino acids,
more preferably between about four and fifteen, still more preferably between about
five and twelve. The randomized substitution is achieved by replacing amino acids
with any member of a predetermined set of replacement residues.

Other aspects of the invention include the families of glubodies created by the
method of the invention, families of nucleic acids encoding these families of glubodies,
methods to produce the families by expressing the modified polynucleotides, and
methods to utilize these families to select ligates or glubodies of particular desired
properties, and in panels for analytical purposes.

Brief Description of the Drawings

Figure 1 shows the ribbon structure of human GST P1-1, including S-hexyl
glutathione docked at the binding site.

Figure 2 shows a pair-wise comparison of glubodies of the Gb/P204 family 
with native GST and with other members of the family. 
Figure 3 shows the gray-scale representation of data from Tables 2 and 3
(inhibition patterns of various Gb/P36 and Gb/P204 glubodies).

Figure 4 shows the gray-scale representation of data from Table 4 (Gb/P204 and
Gb/P206L glubodies).

Figure 5 shows the ribbon structure of retinol binding protein (RBP).
Figure 6 shows the ribbon structure of cyclophilin.

Modes of Carrying Out the Invention

The invention provides a means to obtain a multiplicity of "glubodies" which
exhibit a range of binding properties analogous to the range obtainable in vertebrates
by rearrangement of the immunoglobulin loci to obtain antibodies in response to a wide
range of antigens. The invention takes advantage of the presence in various naturally
occurring proteins of "loop" structures, which are substantially irrelevant to the
remainder of the protein and can be varied in amino acid sequence to generate a wide
range of binding capabilities. The binding capabilities are not necessarily associated
with the modified loop portion alone, but are related to the sequences occurring within
that loop in relation to the remainder of the protein, which could be called a
"framework".

The families of glubodies of the present invention are obtained by modification
of the protogludomains of protoglubodies. A "protoglubody" refers to a protein
having a "protogludomain" and a "framework." (More than one such protogludomain
may be found in a single protoglubody.) A "protogludomain" is a region of a protein,
containing 3-25 amino acids in a contiguous sequence, that: (i) is solvent-exposed or
solvent-accessible; (ii) forms part of the cavity that defines a binding site; (iii) does not
interact appreciably with residues outside the region; and (iv) in most cases, and
preferably, lacks a well defined secondary structure.

Residues contained in a "solvent-accessible" domain are at least partly in
contact with the bulk water surrounding the protein in solution. Solvent-accessibility
can be assessed using any of a variety of techniques including, for example, the method
of Connolly, M.L. Science (1983) 221:709. In many cases, solvent-accessibility will be
apparent from visualization of the 3-D structure of a protein (i.e. in some cases a loop
is apparent next to a concavity at the protein surface). Some proteins, such as HIV
protease, are believed to have solvent-accessible domains that are effectively opened
and closed by a segment of the protein that functions as a "cover" (i.e. the cavity is
covered by a segment that is capable of opening to allow entry of a ligand). Thus,
these covered portions can also be considered solvent-accessible at the time the cover
is open.

A cavity that forms part of the binding site can be located using
computer graphics such as those described by Levitt, D.G. et al. J Mol Graphics
(1992) 10:229-234. This method displays protein cavities and their surrounding amino
acids. Those cavities which are associated with binding sites can be identified by
correlating the results of this method with standard direct techniques for locating
binding sites of proteins.

A domain exhibits a lack of appreciable interaction with residues outside the
domain when the residues in the domain have few if any structural contacts with other
parts of the protein in which they reside (other than the peptide linkage). Thus,
individual amino acid residues in such a topologically independent domain exhibit few
secondary structural contacts, such as H-bonds, π-π interactions or salt pairs, with
amino acid residues in the framework outside of the protogludomain. While any
limitation on the specific number of interactions would necessarily be arbitrary, it
appears that fewer than three hydrogen bonds or salt bridges would be an acceptable
cut-off point; preferably, no interactions of this type are exhibited.

The protogludomain also preferably exhibits little or no secondary structure.
Lack of such structure is characteristic, for example, of β-turn regions and looser
"C"-like structures.

Thus, the protogludomains of the invention represent structures that can
loosely be referred to as "loops". The individual amino acids in a loop tend to exhibit
fewer secondary structural contacts with neighboring amino acids (as compared to
regions exhibiting pronounced secondary structure such as helices, sheets and
hydrophobic cores). Preferred loops of the present invention tend to form part of a
cavity and can be found on or near the surface of the protein. Preferred loops of the
present invention also generally exhibit larger than average temperature factors (when
the crystal structure is known). Correspondingly, in molecular dynamics simulations,
preferred loops tend to exhibit larger than average atomic motions along a trajectory.

The "framework" refers to a portion of the protoglubody outside of the
protogludomains that can act as a "scaffold" onto which a modified protogludomain
(i.e., a gludomain) can be grafted using the techniques of the present invention. The
primary function of the framework is to effectively display the protogludomain or
gludomain on its surface so as to be available for binding target molecules.

Thus, a preferred framework for the generation of glubody families is contained
in a protein that is known to:

1) bind peptides and/or small organic (haptenic) compounds;

2) have an active site/binding loop amenable to modification;

3) be monomeric with reasonably low molecular mass, although it may be
assembled into homo- and heterodimers or multimers;

4) be stably expressed in the periplasm of E. coli or secreted if possible;

5) have a binding site distal to the N- or C-terminus; and

6) tolerate minor modifications of N- and C-termini, for example, for
purification or detection.

The segment targeted (the protogludomain) 1) should be directly in or
directly adjacent to the active site of the enzyme or protein; 2) should be directly
involved in the binding of substrate/ligand molecules; 3) should not be part of the
hydrophobic core of the protein and thus crucial to the structural integrity; and 4) is
preferably in a flexible loop, notably a β-turn region.

All of the foregoing properties may be established by examination of available
crystallographic structural data, by computational chemistry or by homology modeling.

To summarize, a "glubody" refers to a modified form of a naturally occurring
protoglubody protein having an unmodified framework and at least one modified
protogludomain, whereby the resulting gludomain confers altered binding specificity on
the glubody relative to its corresponding protoglubody. The glubody may be an
assembly of monomers, one or more than one of which contains a modified
protogludomain. Thus "glubody" may refer to a modified monomer or to such
assemblies.

A "family" of glubodies refers to a multiplicity of different member glubodies
derived from a single protoglubody, which differ from each other in the possession of
different gludomains.

A "panel" of glubodies refers to a multiplicity of glubodies that may be, but
need not be, members of a single family.

A "systematically-diversified panel" of glubodies refers to a panel whose
individual members have been selected such that the panel members are collectively
capable of binding to a wide variety of other molecules, but wherein there is a
relatively low level of overlap in the binding specificity of particular panel members.

Another way to describe this maximal, systematically arrived at diversity is in
terms of the number of principal components needed to capture 50% of the variance in
binding of the glubody panel to a set of compounds; that set of compounds would
include, for example, a set of compounds to which the parent protoglubody and any
naturally occurring homologs bind. If the IC50 values for this set of compounds
change idiosyncratically for each of the glubodies in the panel, this increases the
number of principal components. On the other hand, for a panel which contains a large
number of members which do not react at all with the compounds, only one principal
component accounts for more than 50% of the variance, i.e., live or dead enzyme.
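
The principal-component measure of diversity described above can be sketched as follows. This is a minimal illustration, not the procedure used to generate the data reported herein: it assumes the binding data are arranged with one row per glubody and one column per compound, that the entries are log IC50 values, and that NumPy is available. The function name components_for_variance is hypothetical.

```python
import numpy as np


def components_for_variance(binding, fraction=0.5):
    """Smallest number of principal components whose cumulative share of
    the variance reaches `fraction`.

    binding: 2-D array, one row per glubody and one column per compound
    (assumed here to hold log IC50 values).
    """
    x = np.asarray(binding, dtype=float)
    centered = x - x.mean(axis=0, keepdims=True)   # centre each compound column
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2               # proportional to PC variances
    total = variances.sum()
    if total == 0.0:
        return 0                                   # all panel members identical
    cumulative = np.cumsum(variances) / total
    # First index whose cumulative share is >= fraction (small float tolerance).
    index = int(np.searchsorted(cumulative, fraction - 1e-12))
    return min(index + 1, len(cumulative))


if __name__ == "__main__":
    # "Live or dead" panel: members either share one profile or do not bind.
    live_or_dead = [[1, 2, 3], [1, 2, 3], [9, 9, 9], [9, 9, 9]]
    # Idiosyncratic panel: each member responds to a different compound.
    idiosyncratic = np.eye(6)
    print(components_for_variance(live_or_dead))    # one component suffices
    print(components_for_variance(idiosyncratic))   # several are needed
```

A larger return value for a given panel indicates that the members differ from one another along more independent directions, which is the sense of "diversity" used in the definition above.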

Identification and analysis of protoglubodies and protogludomains

A preferred source of protoglubodies includes proteins that are already known
or believed to be ligates, i.e. proteins that are capable of binding other molecules.
Among protein ligates, it is most convenient to focus on proteins for which structural
data are available. Molecular modelling analyses in particular are helpful in initially
assessing and provisionally confirming that a particular protein will be useful as a
protoglubody. Protein structures can be assessed using any of a variety of techniques
including, for example, X-ray crystallography, neutron diffraction and nuclear magnetic
resonance. Especially preferred are high resolution crystallographic data based on
crystallization of the protein in combination with a ligand bound to the protein, as
exemplified in the cases of the human GST P1-1, the rat GST 3-3, retinol binding
protein, and cyclophilin discussed below. Analytical methods that provide guidance on
the overall structure, even when it has not been solved experimentally, can also be
employed. Such methods include, for example, homology modeling based on the
solved structures of related or structurally similar proteins, as exemplified below.

With protein structure data available, the first step is the visual inspection of
the three-dimensional structure to identify any protogludomains. For this purpose, the
Insight II molecular modeling package available from Biosym Technologies, Inc., San
Diego, CA is suitable.

A convenient initial screen can be performed by displaying the polypeptide
framework of a prospective protoglubody without displaying the associated side
chains. Suitable displays include framework "wire" models (in which lines connect all
atoms along the polypeptide framework) and, more preferably, "ribbon" models (in
which the framework is drawn as a ribbon exhibiting turns and helices in the protein
structure). In such displays, protogludomains tend to appear as relatively open
domains, especially open loops, that are somewhat isolated from the remainder of the
polypeptide framework.

Another useful and easily generated tool for identifying protogludomains
is a map illustrating pair-wise distances between all alpha carbons in a
prospective protoglubody protein. In such maps, the alpha carbons of
protogludomains tend to exhibit fewer near neighbors than the alpha carbons of other
regions of the protein.
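
A near-neighbor count of this kind can be derived directly from alpha-carbon coordinates. The following is a minimal sketch under stated assumptions: coordinates are supplied as (x, y, z) tuples in Angstroms, and the 10 Angstrom default cutoff is an arbitrary illustrative value, not one specified herein. The function names are hypothetical.

```python
import math


def distance_map(ca_coords):
    """Pair-wise distance matrix (list of lists) between alpha carbons."""
    return [[math.dist(a, b) for b in ca_coords] for a in ca_coords]


def near_neighbor_counts(ca_coords, cutoff=10.0):
    """For each alpha carbon, count the other alpha carbons lying within
    `cutoff` (same units as the coordinates; the default is an assumption)."""
    dmap = distance_map(ca_coords)
    return [
        sum(1 for j, d in enumerate(row) if j != i and d <= cutoff)
        for i, row in enumerate(dmap)
    ]


if __name__ == "__main__":
    # Toy example: seven alpha carbons on a straight line, 3.8 apart.
    chain = [(i * 3.8, 0.0, 0.0) for i in range(7)]
    print(near_neighbor_counts(chain, cutoff=8.0))
```

Residues whose counts fall well below those of the rest of the chain are candidates for the relatively isolated loops described above; in a real structure the counts would be computed from deposited coordinates rather than from a toy chain.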

Preferred protogludomains can also be identified as regions exhibiting relatively
large temperature factors as crystallographically determined. Moreover, using
molecular dynamics simulations, preferred loops can be identified as regions exhibiting
larger than average atomic motions. These motions can be determined using most
commercially available molecular modeling software, including Discover (Biosym
Technologies, Inc., San Diego, CA).
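
As a rough illustration of screening by temperature factor, the sketch below reports contiguous runs of residues whose per-residue temperature factor exceeds the chain average. It assumes one temperature factor per residue (for example, that of the alpha carbon), uses the simple mean as the threshold, and borrows the 3-25 residue length range from the definition of a protogludomain given above; all of these are illustrative choices, and the function name is hypothetical.

```python
from statistics import mean


def high_mobility_segments(b_factors, min_len=3, max_len=25):
    """Return (start, end) index pairs, inclusive and 0-based, of contiguous
    runs whose temperature factor is above the chain mean and whose length
    lies between min_len and max_len residues."""
    values = list(b_factors)
    if not values:
        return []
    threshold = mean(values)
    segments, start = [], None
    # A trailing sentinel below any threshold closes a run at the chain end.
    for i, b in enumerate(values + [float("-inf")]):
        if b > threshold and start is None:
            start = i
        elif b <= threshold and start is not None:
            if min_len <= i - start <= max_len:
                segments.append((start, i - 1))
            start = None
    return segments


if __name__ == "__main__":
    toy = [10] * 5 + [40] * 4 + [10] * 5
    print(high_mobility_segments(toy))
```

The same function could be applied to per-residue root-mean-square fluctuations from a molecular dynamics trajectory in place of crystallographic temperature factors.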

Regions that are part of the hydrophobic core of the protein, regions with
extensive secondary structure, and regions with extensive secondary structural
contacts with other parts of the protein are considered part of the framework, as
are polar residues that are buried (e.g. residues at the interface where oligomerization
occurs).

In some instances, the protogludomain loop comprises a β-turn. Certain amino
acids, particularly Gly, Ser and Tyr, have a tendency to appear in β-turn regions; see
Chou, P.Y. and Fasman, G.D., Annual Review of Biochemistry (1978) 47:251-276.
As noted below, our analyses of the amino acid distributions in a wide variety of
protein ligates suggest that these same amino acids (i.e. Gly, Ser and Tyr) are
overrepresented in ligand binding sites.

The most preferred protogludomains of the present invention are solvent- 
accessible loops that are known to bind ligands and that appear to be quite independent 
topologically from the remainder of the protein. 

In the absence of an experimentally determined three-dimensional structure of
the prospective protoglubody (or a similar protein), some relevant structural
information can still be obtained on the basis of the primary structure (i.e. the protein
sequence). Thus, secondary structure prediction methods in combination with
hydrophobicity plots can be used to identify structural motifs that are likely to be
solvent exposed (see, e.g., the analytical methods described by Fasman, G.D. et al.,
Trends Biochem Sci (1989) 14:295-299; and Benner, S.A. et al., Curr Op Struct Biol
(1992) 2:402-412). Such techniques can be used in conjunction with our sequence
analytical techniques, described infra.
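
A hydrophobicity plot of the kind referred to above can be sketched as a sliding-window average over the sequence. The example below assumes the Kyte-Doolittle hydropathy scale, a window of seven residues and a threshold of zero for flagging hydrophilic windows; none of these choices is specified in the text or in the cited references, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every window of `window` consecutive residues."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def hydrophilic_windows(sequence, window=7, threshold=0.0):
    """Start indices (0-based) of windows whose mean hydropathy is below
    `threshold`, i.e. windows more likely to be solvent exposed."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, h in enumerate(profile) if h < threshold]


if __name__ == "__main__":
    print(hydropathy_profile("AAAKKK", window=3))
    print(hydrophilic_windows("AAAKKK", window=3))
```

Windows flagged in this way are only candidates; as the text notes, such plots are used in combination with secondary structure prediction rather than on their own.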

Differential distributions of particular amino acids provide additional
information to predict the position of ligand-binding pockets. We analyzed the amino
acid distribution patterns at ligand binding sites for 50 diverse protein ligates for which
crystallographic data with bound ligands were available. We found Trp and His were
present 250% and Arg and His 200% more frequently in contact with the ligand than
their average observed across all proteins. More modest increases were observed for
Ser and Asp. Conversely, Pro occurred much less frequently in proximity to the ligand
binding site compared to its overall average. Other residues with decreased
frequencies were Lys, Glu and Ala. Furthermore, Gly, Ser, Arg and Tyr were the most
abundant residues within 4 Å of the ligand. Thus, protogludomains tend to contain
these amino acids.
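
The comparison underlying such an analysis can be sketched as a ratio of frequencies. The example below is illustrative only: it assumes that the residues in contact with the ligand have already been identified (for example, by a distance cutoff) and are supplied as one-letter codes, and it reports, for each amino acid, its frequency among contact residues divided by its overall frequency. The function name and the toy data are hypothetical and do not reproduce the analysis of the 50 proteins mentioned above.

```python
from collections import Counter


def contact_enrichment(contact_residues, all_residues):
    """Ratio of each amino acid's frequency among ligand-contacting residues
    to its frequency among all residues. A ratio above 1 indicates
    over-representation at the binding site; below 1, under-representation."""
    contact = Counter(contact_residues)
    overall = Counter(all_residues)
    n_contact = sum(contact.values())
    n_all = sum(overall.values())
    if n_contact == 0 or n_all == 0:
        raise ValueError("both residue collections must be non-empty")
    return {
        aa: (contact.get(aa, 0) / n_contact) / (count / n_all)
        for aa, count in overall.items()
    }


if __name__ == "__main__":
    # Toy data: one Trp, eight Ala and one Gly overall; Trp and one Ala in contact.
    ratios = contact_enrichment("WA", "WAAAAAAAAG")
    for aa, ratio in sorted(ratios.items()):
        print(aa, round(ratio, 3))
```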

In some cases, additional evidence will be available suggesting or confirming
that a candidate protogludomain is involved in the binding of small molecular targets.
For example, point mutation analyses may have identified residues likely to be involved
in binding.

Thus, depending on what sort of information is available or readily obtainable
for a prospective protoglubody, one or more of the aforementioned techniques can be
employed to identify and analyze putative protogludomains within the protein.
Illustrative examples of the use of these techniques in the context of several different
types of proteins are described below.

A preferred subclass of protoglubodies will exhibit several other features that
make subsequent steps in the manipulation of glubodies particularly convenient. Such
preferred characteristics include: the ability to be stably expressed in E. coli; the ability
to be transported to the bacterial periplasm for surface expression on filamentous
bacteriophage particles; the ability of the amino- or the carboxy-terminus to be
modified with a tag for efficient purification (e.g. a hexahistidine tag); and a relatively
small molecular size (preferably less than about 80 kD, more preferably in the 10-50
kD range).

Particularly preferred sources of protoglubodies are proteins that already bind
to a number of small molecules. Among these are the so-called "protective" proteins
that function in mammalian systems in the detoxification of exogenous substances and
byproducts of oxidative metabolism.

Especially prominent among such enzymes are the glutathione S-transferases
(GSTs; Enzyme No. EC 2.5.1.18). GSTs comprise a family of homodimeric cytosolic
enzymes that catalyze the conjugation of glutathione (GSH) to a broad range of
hydrophobic electrophiles. This reaction is one of the first steps in the inactivation and
subsequent elimination of toxic xenobiotics which gain entry into the cell. The putative
binding pocket of GSTs has been proposed to consist of a GSH-binding domain, the
G-site, and a second site, known as the H-site, believed to interact with hydrophobic
compounds. A number of GSTs have been crystallized and characterized in the
presence of an inhibitor that binds to the putative active site. The crystal structure of
human GST P1-1, a class Pi GST from human placenta, has been characterized in
complex with S-hexyl glutathione at 2.8 Å resolution; Reinemer, P., et al. J Mol Biol
(1992) 227:214-226.

Modification of Protogludomains

After identifying a protogludomain that is to be targeted, the mechanical steps
of mutagenesis can be performed using any of a variety of techniques. Most
conveniently, however, PCR mutagenesis and related techniques can be used to
randomize or otherwise alter residues in the region to be targeted. Such randomization
or alteration can be readily tailored to result in the replacement of individual residues
with any member of a pre-determined set of replacement residues. The replacement
set can include, for example, the entire set of natural amino acids or pre-determined
subsets thereof. Mutagenesis of the protogludomain can also include, for example, the
addition or deletion of one or more residues within the domain. Such an approach can
be used to further expand the binding affinities of the families of resulting glubodies by
generating family members having binding pockets of different sizes.
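
At the sequence level, randomization of a contiguous segment over a pre-determined replacement set amounts to enumerating every combination of replacement residues at the targeted positions. The sketch below illustrates this in silico only; it does not model the PCR mutagenesis itself. The example sequence, the segment position and the two-residue replacement set are hypothetical, as is the function name.

```python
from itertools import product


def randomized_variants(sequence, start, length, replacement_set):
    """Yield every sequence obtained by replacing the `length` residues
    beginning at 0-based index `start` with members of `replacement_set`."""
    if start < 0 or length < 1 or start + length > len(sequence):
        raise ValueError("segment must lie within the sequence")
    residues = sorted(set(replacement_set))
    prefix, suffix = sequence[:start], sequence[start + length:]
    for combination in product(residues, repeat=length):
        yield prefix + "".join(combination) + suffix


if __name__ == "__main__":
    # Hypothetical six-residue sequence; positions 2-3 varied over Gly and Ser.
    for variant in randomized_variants("MKTAYL", start=2, length=2,
                                       replacement_set="GS"):
        print(variant)
```

The number of variants is the size of the replacement set raised to the power of the segment length, so restricting the replacement set is the most direct way to keep a family to a manageable size.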

Although PCR-based mutagenesis is illustrated in the examples hereinbelow,
alternative methods for synthesizing the glubodies of the invention can readily be
envisioned. Although most of them are substantially less convenient, they are at least
theoretically possible. For example, the DNA encoding the entire glubody may be
synthesized de novo, or only portions thereof may be synthesized de novo and ligated
to portions obtained from cDNA or genomic DNA. The glubodies of the invention
can be synthesized individually in this manner, or, as described below, an entire family
can be synthesized at once. One additional approach involves the use of codon amidites
as described in copending U.S. application Serial No. 08/344,820 filed 23 November
1994, incorporated herein by reference. The use of protein synthesis techniques is also
theoretically possible.

Assuming the most convenient method for preparing glubody families is
applied (namely, modification of the protogludomain by altering the amino acid
sequence thereof at the DNA level and producing the resulting glubodies in a
recombinant host), individual colonies of host cells are cultured to obtain a library
containing the members of the glubody family. The members of the glubody family can
then be tested for ability to bind small molecule target candidates using standard
biopanning or immunoassay-type techniques. Specific embodiments of these
techniques are illustrated in the examples below. Depending on the manner in which
glubody-encoding genes are expressed, the glubodies themselves may be displayed at
the surface of the cells and/or phagemid particles secreted into the medium, or
produced intracellularly. The methods for recovering the individual member glubodies
will vary depending on the nature of the expression.

The families themselves may comprise members wherein the
protogludomains have been completely randomized, resulting in large numbers of
family members, or, preferably, the modifications in the protogludomains can be
designed to confer maximal diversity in the family members. Techniques and
considerations for designing such modified gludomains are based on the considerations
set forth in U.S. Patents 4,963,263, 5,340,474 and 5,133,866, all of which are
incorporated herein by reference. Briefly, consideration is given to the properties of
the amino acid sequence in the resulting gludomain in terms of maximizing diversity by
supplying monomers that are maximally diverse in regard to at least two parameters
that affect binding ability. In addition, advantage can be taken of the propensity of
binding sites to contain preferred amino acid residues as described above.

Use of Glubody Families and Panels

The glubody families of the invention have a diversity of characteristics similar
to that exhibited by the full basal repertoire of antibodies produced by vertebrate
species. Accordingly, panels of glubodies can be used in a manner similar to antibodies
in panels which are capable of fingerprinting individual compounds, matching patterns
of fingerprints to determine binding capabilities of candidate compounds, and in
identifying ligand-ligate pairs. These techniques are already described in detail in the
art and need not be repeated here. Determination of molecular fingerprints for
characterizing a single analyte and using these fingerprints to identify a candidate with
qualities similar to those of known compounds is described in detail in U.S. Patents
5,217,869 and 5,300,425, both incorporated herein by reference.

As described, a single analyte can be characterized by obtaining a profile of
reactivities of the analyte with the various glubody members of the panel. The
characteristic pattern which emerges uniquely describes the analyte in question. The
pattern of reactivities can either be determined by directly measuring the interaction of
each member of the panel with the analyte, or by using a competitive technique
described in the above-referenced patents. In the competitive technique, a diverse
mixture of mimotopes, which mixture reacts essentially uniformly with each member of
the panel, is labeled and used to compete with the analyte to measure reactivity with
respect to each member.
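
The notion of matching a reactivity profile against the profiles of known compounds can be illustrated with a simple nearest-profile search. The sketch below is not the method of the above-referenced patents; it assumes each profile is a list of numerical reactivities in a fixed panel order and uses Euclidean distance as one plausible measure of similarity. The compound names, values and function name are hypothetical.

```python
import math


def closest_reference(profile, references):
    """Return (name, distance) of the reference profile nearest to `profile`.

    references: mapping of compound name to a reactivity profile measured
    against the same panel members in the same order.
    """
    if not references:
        raise ValueError("at least one reference profile is required")
    best_name, best_distance = None, math.inf
    for name, reference in references.items():
        if len(reference) != len(profile):
            raise ValueError(f"profile length mismatch for {name!r}")
        distance = math.dist(profile, reference)   # Euclidean distance
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name, best_distance


if __name__ == "__main__":
    # Hypothetical reactivities of two known compounds with a 3-member panel.
    known = {
        "compound_A": [0.9, 0.1, 0.5],
        "compound_B": [0.1, 0.8, 0.2],
    }
    print(closest_reference([0.8, 0.2, 0.5], known))
```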

The use of such pattern matching techniques for analytical purposes is also
described in U.S. Patent 5,338,659. Use of limited numbers of the glubody family in
analytical techniques is also described in U.S. Patent 5,356,784. All of these patents
are incorporated herein by reference.

In all of these applications, the panels may contain members of a single family
of glubodies or can be comprised of members from two or more families. The
selection of panel members depends on the application and availability.

Individual glubodies can also be used as members of reference panels used in
technologies which translate binding capabilities of known compounds to screen small
molecules for similar binding activities as described in U.S. Serial No. 08/177,673,
filed 6 January 1994 and U.S. Serial No. 08/308,813, filed 19 September 1994. The
disclosures of both applications are incorporated herein by reference.

In addition, individual glubodies can be used as affinity reagents, targeting
agents, drug delivery vehicles, and the like, and in general in any manner that
antibodies or immunologically reactive fragments of antibodies can be used. Since the
glubodies also retain, in many instances, the ability to catalyze chemical reactions, they
can be used as catalysts in a manner similar to that of their parent protoglubodies, with
the added advantage that binding specificity and inhibition profiles can be altered in a
manner appropriate for a particular set of reaction conditions.

Still another use for the glubodies of the invention is as general catalytic
reagents. It is well known that antibodies can be used as catalysts in certain instances.
A summary of antibodies as catalysts can be found, for example, in an article by
Lerner, R.A. et al. Science (1991) 252:659-667.

Glubodies are expected to be superior catalysts as compared to antibodies 
because the binding cleft of antibodies is relatively shallow as compared to that of most 
protogludomains, and indeed as compared to most enzymes. Therefore, in the 
glubodies of the invention, greater surface area is available for binding. 

Particular antibodies have shown a modest ability to catalyze any of a wide
range of reactions; the same range should be available with respect to glubodies. Thus,
the nature of the catalytic activity is not limited by any original catalytic function of the
protoglubody. Furthermore, the deeper clefts in glubodies should allow for better
catalytic rate increases.

Finally, expression of genes encoding glubodies, especially intracellularly,
provides a method for modulating the metabolism of the cell, in essence as a gene
therapy tool. Others have provided precedents for this approach. For example,
Carlson, J.R. in Mol Cell Biol (1988) 8:2638-2646 reports the intracellular expression
of antibodies capable of neutralizing the endogenous yeast enzyme ADH 1 in vivo. The
cDNAs encoding the heavy and light chains were modified to remove signal sequences
to prevent secretion, and the antibodies produced intracellularly were shown to
neutralize the activity of the target enzyme in vivo.

Another report which demonstrates that intracellularly produced proteins can
successfully interact is by Chien, C.T. et al. Proc Natl Acad Sci USA (1991) 88:9578-9582.
Yeast cells were provided with an expression system containing a binding site
for the GAL4 transcription activator protein upstream of a reporter protein-encoding
sequence, so that expression of the reporter would occur only if the GAL4 protein
were produced. The GAL4 protein has two regions, both necessary for its function:
a binding region and a polymerase activating region. These two regions were encoded
on separate plasmids as part of coding sequences for chimeric proteins where the
non-GAL4-related portion of one of the chimeras was an amino acid sequence designed
to interact with the non-GAL4 portion of the other chimera. Using the SIR4 protein,
which is known to form homodimers, as the binding region of the chimeras, this system
was successfully used to effect expression of the reporter gene, thus demonstrating
that the two subunits of the homodimer interacted in vivo.

Thus, in a manner similar to that described above, expression systems for
production of the glubodies of the invention can be used to modify cells so that the
glubodies produced interact with targeted proteins that modulate metabolism. Such
modulation could occur not only by binding targets intracellularly, but also by
providing unique catalytic activity as described previously.

This last possibility also has some precedent. Gudkov, A.V. et al. Proc Natl
Acad Sci USA (1993) 90:3231-3235 reported the isolation, from cells resistant to
drugs that act on topoisomerase-II (Topo II), of a total of 12 different suppressor
elements that corresponded to short segments of the Topo II α molecule or antisense
RNA sequences which prevented Topo II gene expression.

Expression systems for the glubodies of the invention, therefore, in a similar
manner can be used to alter the characteristics of a cell, including drug resistance.

EXAMPLES

The practice of the present invention will employ, unless otherwise indicated,
conventional techniques of molecular biology, microbiology, recombinant DNA, and
immunology, which are within the skill of the art. Such techniques are explained fully
in the literature. See e.g., Sambrook, Fritsch, and Maniatis, MOLECULAR
CLONING: A LABORATORY MANUAL, Second Edition (1989);
OLIGONUCLEOTIDE SYNTHESIS (M.J. Gait Ed., 1984); ANIMAL CELL
CULTURE (R.I. Freshney, Ed., 1987); the series METHODS IN ENZYMOLOGY
(Academic Press, Inc.); GENE TRANSFER VECTORS FOR MAMMALIAN CELLS
(J.M. Miller and M.P. Calos Eds. 1987); HANDBOOK OF EXPERIMENTAL
IMMUNOLOGY (D.M. Weir and C.C. Blackwell, Eds.); CURRENT PROTOCOLS
IN MOLECULAR BIOLOGY (F.M. Ausubel, R. Brent, R.E. Kingston, D.D. Moore,
J.G. Seidman, J.A. Smith, and K. Struhl Eds. 1987); CURRENT PROTOCOLS IN
IMMUNOLOGY (J.E. Coligan, A.M. Kruisbeek, D.H. Margulies, E.M. Shevach and
W. Strober Eds. 1991); and PCR PROTOCOLS: A GUIDE TO METHODS AND
APPLICATIONS (M.A. Innis, D.H. Gelfand, J.J. Sninsky and T.J. White Eds. 1990).

The examples presented below are provided as a further guide to the
practitioner of ordinary skill in the art, and are not to be construed as limiting the
invention in any way.

Example 1

Identification and Analysis of Two ProtoGludomains
in a Human Glutathione S-Transferase (GST-P1-1) ProtoGlubody
In order to identify potential protogludomains in the human GST-P1-1 protein,
we analyzed the protein for presence of the characteristics of gludomains as described
above.

Using the Insight II molecular modelling package available from Biosym
Technologies (San Diego, CA), we first visualized the three-dimensional structure of
human GST-P1-1 using a "ribbon model" based on the crystal structure described by
Reinemer, P., et al. J Mol Biol (1992) 227:214-226. It had been observed that the
folding topology, overall structure and subunit association of human GST-P1-1 closely
resembled the structure of the porcine class Pi GSTs.

The ribbon model of human GST-P1-1 is illustrated in Figure 1. As is apparent
from the model, human GST-P1-1 contains an open loop (highlighted in Figure 1),
located at the outer surface of the protein, that is relatively detached from the
remainder of the protein framework. Moreover, from the data obtained by
crystallizing GST-P1-1 in the presence of an inhibitor (S-hexylglutathione), Reinemer
(supra), this loop is believed to be adjacent to the binding site for electrophilic
substrates of the GST-P1-1 enzyme (the S-hexylglutathione molecule is also depicted
in Figure 1).

A more detailed analysis of the putative protogludomain was conducted in
order to identify particular residues that could be suitably targeted by loop
mutagenesis. The entire segment from residues 36-43 appeared to form part of the
loop. Analysis of temperature factors revealed that this region exhibited higher than
average temperature factors, indicating a high degree of flexibility, as expected for an
exposed region with little secondary structure.

In the case of the protoglubody GST-P1-1, convenient and well-known assays
for catalytic activity were available, and detailed information about the catalytic site
was also available. We therefore decided, for this initial example, to further focus the
modifications on residues that were likely to alter the binding of small xenobiotic
compounds, without affecting the binding of the conjugation substrate glutathione.
Thus, we focused on residues that were close to the S-hexyl group, but not likely to be
directly involved with glutathione binding. Among the residues within 4 Å of the
S-hexyl group are: Tyr-7, which is likely to be involved in catalysis; Tyr-106, which is in
the region that has been implicated in the formation of salt bridges that stabilize the
dimeric structure of the GSTs; and Val-35, which is part of an α-helical portion that is
believed to form part of both the glutathione binding site and the putative xenobiotic
binding site (P-site). Lys-44 is also believed to be important for glutathione binding as
it is believed to form a salt bridge with the carboxylate terminus of glutathione.

The segment from Glu-36 to Leu-43 (the "36-43" loop, which has the sequence
ETWQEGSL) was selected as the site to be altered. Although Trp-38 is believed to
act as a hydrogen bond donor to the Gly residue of glutathione, this single interaction
should be a relatively weak one.

Further evidence suggesting that the region just beyond Val-35 might be useful
as a gludomain came from homology studies and from information about the
prospective binding sites in GST-P1-1. First, the most significant differences between
the human and porcine Pi-class crystal structures are in this region, with the human
form containing 2 residues that are lacking in the porcine form, a change that does not
appear to prevent enzyme activity. Thus, individual residues in this region are not
important to the overall function and stability of the protein. Second, while the GST
isozymes from other classes resemble the Pi form in overall subunit folding topology
and subunit association, the secondary structure of the region around Val-35 is
markedly divergent, with no α-helical character at all for this segment in the published
Mu-class isozyme (Ji, X. et al. Biochemistry (1992) 31:10169-10181).

As described below, we created a glubody family (the "Gb/P36" family) by
randomizing all of the amino acids in the segment from Glu-36 to Leu-43, using all of
the natural amino acids as the set of "replacement residues". This loop mutagenesis, in
which a very large variety of new loops are effectively grafted onto the protoglubody
framework, was conveniently achieved using PCR mutagenesis, as described below.

Another protogludomain selected for modification was identified in the
C-terminal region of GST-P1-1 as a loop comprising the residues from Ile-204 to
Gln-210 (the "204-210" loop, which has the sequence INGNGKQ). The 204-210 loop is
near the same cavity referred to above with respect to the 36-43 loop. Although the
C-terminal 204-210 loop exhibited a higher than average temperature factor, it was not
as high as that observed in the case of the 36-43 loop, suggesting that this C-terminal
loop would not exhibit as much flexibility as the 36-43 loop.

A second glubody family, designated the Gb/P204 family, contained similar
alterations in this loop. (We also created a glubody family (the "Gb/P36/P204" family)
in which the glubodies contained modifications at both the 36-43 loop and the 204-210
loop.)

Finally, a glubody family was prepared wherein a randomized five amino acid
sequence followed by a proline residue was inserted between residues 206 and 207.
This family was designated Gb/P206L.



Example 2

Synthesis of the "Gb/P36" Family of Glubodies
The 36-43 loop in the human GST-P1-1 was completely randomized using
PCR mutagenesis as described by Barbas et al. Proc Natl Acad Sci USA (1992)
89:4457-4461.

The complete sequence of human GST-P1-1 cDNA is available from GENBANK
(X06547). Human GST-P1-1 cDNA in the expression vector pKXHP1 was obtained
from Kolm, R.H. et al. as described in "Glutathione S-Transferases: Molecular
Cloning, Site Directed Mutagenesis and Structure-Function Studies", G. Steinberg
thesis, University of Stockholm (1991), and used as a PCR template. Since SfiI was to
be used in subsequent cloning steps, as described below, we eliminated an internal SfiI
site (at position 573-585 in the human P1 cDNA) using overlap PCR mutagenesis that
employed the primers GST-P1SfiImutRC (see Table 1 for all primers). A G-to-A
substitution at position 582 removed the SfiI site while leaving the amino acid
sequence unchanged.
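
The effect of the G-to-A substitution on the SfiI site can be checked in a few lines. The following is an illustrative sketch only. It assumes the standard SfiI recognition pattern GGCCNNNNNGGCC, takes the mutant sequence from primer GST-P1SfiImut in Table 1, and reconstructs a hypothetical wild-type fragment by reverting the single A back to G; the helper names are not part of the original disclosure.

```python
import re

# SfiI recognizes GGCC, any five bases, then GGCC (13 bp in total).
SFII_SITE = re.compile(r"GGCC[ACGT]{5}GGCC")

# Mutant sequence as listed for primer GST-P1SfiImut in Table 1.
MUTANT = "TCAAGGCCTTCCTAGCCTCCCCTGA"


def revert_a_to_g(seq: str, index: int) -> str:
    """Return seq with the A at `index` changed back to G (hypothetical wild type)."""
    if seq[index] != "A":
        raise ValueError("expected an A at the reverted position")
    return seq[:index] + "G" + seq[index + 1:]


def has_sfii_site(seq: str) -> bool:
    """True if the sequence contains at least one SfiI recognition site."""
    return SFII_SITE.search(seq) is not None


# Assumption: the substituted base is the A that interrupts the second GGCC.
WILD_TYPE = revert_a_to_g(MUTANT, MUTANT.index("TAGCC") + 1)

if __name__ == "__main__":
    print("hypothetical wild type has SfiI site:", has_sfii_site(WILD_TYPE))
    print("mutant primer has SfiI site:         ", has_sfii_site(MUTANT))
```

The check covers only the recognition sequence; whether the substitution is silent depends on the reading frame of the full cDNA, which this fragment alone does not establish.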

TABLE 1

Primers

Primer Name            Primer Sequence

GST-P1SfiImutRC        5'-TCAGGGGAGGCTAGGAGGCCTTGA-3'

GST-P1SfiImut          5'-TCAAGGCCTTCCTAGCCTCCCCTGA-3'

GST-P1SfiI             5'-CATGCCATGACTCGCGGCCCAGCCGGCCATGGCATGCCTCCATACA
                       C ACjTTVjTTI A-3'

GluPi-2                5'-CACGGTCACCACCTCCTCCTTCCA-3'

GluPi-1                5'-AAGGAGGAGGTGGTGACCGTGNNSNNSNNSNNSNNSNNSNNSNNSA
                       AAGCCTCCTGCCTATACGGG-3'

GST-P1MODNotI          5'-CCAGCATTCTGCGGCCGCCTGTTTCCCGTTGCCATTGATGG-3'

GST-P1MODNotIrandom    5'-CCAGCATTCTGCGGCCGCSNNSNNSNNSNNSNNSNNSNNGGGGAG
                       GTTCACGTACTCAGG-3'

GST-Loop-Pi            5'-CCAGCATTCTGCGGCCGCCTGTTTCCCGTTGCCGGGSNNSNNSNNSN
                       NSNNATTGATGGGGGAGGTTCAC-3'
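
The degenerate positions in the Table 1 primers use the IUPAC codes N (any base) and S (C or G). As a hedged illustration of what an NNS codon can encode, the sketch below enumerates the 32 NNS codons against the standard genetic code and counts the randomized codons in three of the primers. The function names and the run-length counting rule are assumptions made for this example, and the primer strings are the cleaned sequences from Table 1.

```python
import re
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (first base slowest).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

IUPAC = {"N": "ACGT", "S": "CG"}


def expand(degenerate_codon: str) -> list[str]:
    """List every concrete codon matched by a degenerate codon such as 'NNS'."""
    pools = [IUPAC.get(base, base) for base in degenerate_codon]
    return ["".join(c) for c in product(*pools)]


NNS = expand("NNS")
STOPS = [c for c in NNS if CODON_TABLE[c] == "*"]
AMINO_ACIDS = {CODON_TABLE[c] for c in NNS} - {"*"}

PRIMERS = {
    "GluPi-1": "AAGGAGGAGGTGGTGACCGTG" + "NNS" * 8 + "AAAGCCTCCTGCCTATACGGG",
    "GST-P1MODNotIrandom": "CCAGCATTCTGCGGCCGC" + "SNN" * 7 + "GGGGAGGTTCACGTACTCAGG",
    "GST-Loop-Pi": "CCAGCATTCTGCGGCCGCCTGTTTCCCGTTGCCGGG" + "SNN" * 5
                   + "ATTGATGGGGGAGGTTCAC",
}


def randomized_codons(primer: str) -> int:
    """Count degenerate codons as the longest run of N/S letters divided by 3."""
    runs = re.findall(r"[NS]+", primer)
    return max((len(run) for run in runs), default=0) // 3


if __name__ == "__main__":
    print(len(NNS), "NNS codons,", len(AMINO_ACIDS), "amino acids, stops:", STOPS)
    for name, seq in PRIMERS.items():
        n = randomized_codons(seq)
        print(f"{name}: {n} randomized codons, {20 ** n:.3g} possible peptides")
```

SNN in the reverse primers is simply the reverse complement of NNS on the coding strand, so the same counts apply to the 204-210 loop and the insertion library.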



Two primary amplifications were performed. Reaction 1 contained 10 pmol
each of primers GST-P1SfiI and GluPi-2, Perkin-Elmer Taq polymerase buffer (with 2
mM MgCl2), 10 ng of template pKXHP1, all four dNTPs (250 µM each), and 2.5 units
of Taq polymerase, in a final volume of 50 µl. Reaction 2 was identical to reaction 1
except that it contained the primers GluPi-1 and GST-P1MODNotI. Using an
Omnigene thermal cycler, the reaction mixes were put through 25 cycles of
denaturation (94°C, 1 min), annealing (65°C, 1 min), and extension (72°C, 1 min),
followed by a final cycle of extension (72°C, 10 min). The reaction products were gel
purified, subjected to overlap extension, and assembled as follows: 100 ng of purified
product from reaction 1 was combined with 100 ng of purified product from reaction
2, and then added to a PCR reaction mix containing Taq polymerase buffer (with 2
mM MgCl2), all four dNTPs (250 µM each), and 2.5 units of Taq polymerase, in a
final volume of 50 µl. This assembly mix was then taken through seven rounds of
denaturation (94°C, 1 min) and annealing (65°C, 2.5 min), after which 10 pmol each
of primers GST-P1SfiI and GST-P1MODNotI were added and the PCR amplification
was continued for 25 cycles as above. The resulting product is DNA encoding a family
of GST-P1 mutants with randomized loops in the position of the original 36-43 loop,
designated "Gb/P36" cDNA. The cDNA fragment was gel purified, digested with SfiI
and NotI, and gel purified once again.
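
Overlap extension depends on the two primary products sharing a common stretch of sequence at the junction. As a minimal sketch, assuming the GluPi-1 and GluPi-2 sequences as listed in Table 1, the code below computes the region where the reverse complement of GluPi-2 coincides with the 5' end of GluPi-1. The function names are illustrative and not taken from the source.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

GLUPI_2 = "CACGGTCACCACCTCCTCCTTCCA"  # reverse primer of reaction 1
GLUPI_1 = "AAGGAGGAGGTGGTGACCGTG" + "NNS" * 8 + "AAAGCCTCCTGCCTATACGGG"


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_overlap(reverse_primer: str, forward_primer: str) -> str:
    """Longest 5' stretch of forward_primer that ends the top strand
    defined by reverse_primer."""
    top_strand = reverse_complement(reverse_primer)
    for k in range(min(len(top_strand), len(forward_primer)), 0, -1):
        if top_strand.endswith(forward_primer[:k]):
            return forward_primer[:k]
    return ""


OVERLAP = primer_overlap(GLUPI_2, GLUPI_1)

if __name__ == "__main__":
    print(f"shared junction ({len(OVERLAP)} nt): {OVERLAP}")
```

The shared stretch lies immediately upstream of the randomized codons, so the annealed products of reactions 1 and 2 can prime each other during the primer-free assembly rounds.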

The purified glubody cDNAs were ligated into a phagemid vector which can be
used to facilitate expression of the glubodies either in the bacterial periplasm or on the
surface of bacteria as fusions to the phage particle. The phagemid pHEN-1, described
by Hoogenboom, et al. Nucleic Acids Res (1991) 19:4133-4137, was used for this
purpose.

Digested cDNA (1 µg) was ligated (using a standard ligation reaction as
described by Maniatis) to 1 µg SfiI/NotI-restricted pHEN-1. Ligated phagemid DNA
was then electro-transformed into E. coli strain TG-1 by following established
procedures (Hoogenboom, et al. (supra)). Transformants were spread onto two 150
mm 2 X YT agar plates containing 100 µg/ml ampicillin for selection and 1% glucose,
and incubated at 37°C overnight. Approximately 5X10* individual recombinant
clones were generated. Cells were then scraped from the plates into 5 ml 2 X YT
medium containing 100 µg/ml ampicillin and 1% glucose. 100 µl of scraped cells was
used to inoculate 50 ml of 2 X YT including 100 µg/ml ampicillin and 1% glucose.
This culture was grown at 37°C with shaking until the OD600 was approximately 1. It
was then diluted 1:10 in the above medium and 5 µl of VCSM13 helper phage (7 X
10^10 pfu/µl) added. The culture was incubated at 37°C for 15 min without shaking
and further incubated at 37°C with shaking for an additional 45 min. Finally, this 50
ml culture was added to one liter of 2 X YT containing 100 µg/ml ampicillin and 50
µg/ml kanamycin and incubated overnight at 30°C.

The next day, phagemids were prepared by polyethylene glycol precipitation
using the following protocol. The bacterial culture was centrifuged at 4000 rpm
for 10 min. at 4°C using a GSA rotor. The supernatant from this centrifugation was
respun at 8000 rpm for 10 min. at 4°C in the GSA rotor. 0.15 volumes (150 ml) of
16.7% PEG/3.3 M NaCl was added to this second supernatant. The solution was
mixed well, placed at 4°C for one hour and spun at 8000 rpm for 30 min. in a GSA
rotor maintained at 4°C. Phage pellets were resuspended in 40 ml dH2O followed by
the addition of 0.15 vol of PEG/NaCl. The solution was mixed well, placed at 4°C for
20 min. and spun at 8000 rpm in an SS34 rotor at 4°C. The supernatant from this spin
was decanted and the phage pellet resuspended in 2 ml of sterile PBS. Resuspended
phage were respun for 5 min. at 14,000 rpm in a microfuge and the supernatant filtered
through a 0.45 µm sterile filter. The phagemids constituting the library or glubody
"family" were then titered and used in biopanning experiments described below.



Example 3

Glubody Purification Methods
To obtain purified glubody protein, the cDNA inserts of members of the family
can be subcloned into pUC119Hismyc, a phagemid vector which directs the expression
of cloned cDNAs as fusions to a six residue histidine tag which may be utilized in
chelating affinity chromatography. Bacterial extracts are prepared as previously
described, except that cells are resuspended and sonicated in column loading buffer
(50 mM phosphate buffer pH 7.5; 500 mM NaCl; 20 mM imidazole). Extracts are
spun at 10,000 rpm for 30 minutes at 4°C, and then supernatants are loaded onto a
Ni-NTA resin column (Qiagen, Chatsworth, CA) and washed with 5 column volumes
of 50 mM phosphate buffer pH 7.5, 500 mM NaCl, 35 mM imidazole. Glubodies are
eluted with 50 mM phosphate buffer, 500 mM NaCl and 100 mM imidazole, and
collected in six 1 ml fractions.

Randomly picked individual members of the glubody library, randomized in the
region from residues 36 to 43, are screened for immunoreactivity in Western blot
analysis to be sure the vector constructions had been effective. In most instances,
recombinant proteins from bacterial extracts are detected with antibodies against both
GST-P1 and a fragment of c-myc, which links the C-terminus of glubodies with the
N-terminus of phage gene III (data not shown).

Alternatively, glubodies can be prepared intracellularly and extracted. Bacterial
cultures are centrifuged at 7000 X g in a Sorvall SS-34 rotor for 5 min. at 4°C. Cell
pellets are frozen in dry ice/ethanol and resuspended in lysis buffer [10 mM Tris-HCl,
pH 7.8, 50 mM EDTA, 15% glucose and 1 mg/ml lysozyme (Sigma)]. PMSF is added
to a final concentration of 250 µM and the solution allowed to sit on ice for 1 hr. The
suspension is sonicated for 2 min. with a Branson Sonifier 450 at 50% duty cycle and 6
output setting. Samples are centrifuged for 30 min. at 14,500 rpm (25,000 X g) in the
Sorvall SS-34 rotor at 4°C. The supernatant is collected and stored at 4°C until
further use.

Alternatively, an overnight culture of HB2151 grown in 2 X YT is diluted 1:50
and grown until the OD600 is 0.5. The culture is then infected with 2 µl of phagemid
supernatant from recombinant glubodies originally propagated in bacterial strain TG-1.
The culture is incubated for one hour at 37°C with shaking, after which time
ampicillin is added to a final concentration of 100 µg/ml and the culture further
incubated for one hour at 37°C with shaking. IPTG is then added to a final
concentration of 1 mM and the culture incubated overnight at 30°C with shaking (225
rpm). Cultures are collected and cells pelleted at 14 K rpm in an Eppendorf microfuge.
Supernatants are transferred to separate tubes for use in assays.

Example 4

Analysis of the Binding Properties of "Gb/P36" Glubodies
Gb/P36 glubodies were first tested for glutathione-S-transferase activity as
measured by their ability to conjugate CDNB (1-chloro-2,4-dinitrobenzene) to GSH
(glutathione), with an accompanying change in OD at 340 nm. Conjugation of GSH
and CDNB was followed for 5 min at 30°C by measuring the absorbance at 340 nm in
a thermostated microtiter plate reader (Molecular Devices). Although the recombinant
glubodies were quite similar by the structural analyses described above, they exhibited
variation in their enzymatic properties. Of 30 randomly picked glubodies, 14 retained
measurable enzyme activity in a control assay with the indicator substrate CDNB
(other clones were undetectable over background or had very low activities). Thus,
approximately one half of the proteins in which the gludomain had been completely
randomized retained the ability to bind CDNB and glutathione in their active sites, and
were able to effectively catalyze the transferase reaction.
Further assessment of the binding properties of the novel glubodies was made
by performing the catalytic reaction in the presence of sixteen potential inhibitors
selected from different chemical classes. IC50s were measured in the standard GST
conjugation assay which contains 1 mM GSH and 1 mM CDNB in 200 mM sodium
phosphate, pH 6.8. Compounds were assayed for their inhibitory activity at 250, 50,
10, 2 and 0.4 mM. The potency of the inhibitors used had previously been found to
vary among the natural GSTs.

The IC50 data, summarized in Table 2 as -log IC50 (µM), demonstrate that a
number of the resulting glubodies exhibited novel binding profiles with respect to this
panel of compounds. (N.D. = not determined.)
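
The text does not state how an IC50 was derived from the five-point dilution series. One common approach, shown here purely as an assumed sketch, is to interpolate linearly in log concentration between the two doses that bracket 50% residual activity. The function name, the use of fractional activity, and the treatment of non-inhibitory compounds are all choices made for this example.

```python
import math


def ic50_log_interp(concs, activities):
    """Estimate IC50 by log-linear interpolation.

    concs: inhibitor concentrations (any order, all > 0).
    activities: residual activity at each concentration, as a fraction of
        the uninhibited rate (1.0 = no inhibition).
    Returns None when activity never falls to 50% within the tested range.
    """
    points = sorted(zip(concs, activities))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 0.5 >= a_hi:
            if a_lo == a_hi:  # both points sit exactly at 50%
                return c_lo
            frac = (a_lo - 0.5) / (a_lo - a_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# The five assay concentrations named in the text, in the same units.
DOSES = [250, 50, 10, 2, 0.4]

if __name__ == "__main__":
    # Hypothetical residual activities, for illustration only.
    print(ic50_log_interp(DOSES, [0.40, 0.60, 0.80, 0.90, 1.00]))
    print(ic50_log_interp(DOSES, [1.00, 1.00, 1.00, 1.00, 1.00]))
```

A compound that never reaches 50% inhibition returns `None` here; a reporting convention would still be needed to enter such compounds in a table.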

TABLE 2

-log IC50 Values (µM)

                                                   Native      Gb/36-1     Gb/36-2
Inhibitor                                          ETWQEGSL    FRRLLPSL    WRWEILV

Phloxine B                                           22.00       N.D.        N.D.
Fluoresceinamine, Isomer II                        2500.00     2500.00     2500.00
Cibacron Brilliant Red 3BA                            3.30       20.15       23.84
Fluorescein Isothiocyanate, Isomer I                146.00      613.17      340.89
9-Phenyl-2,3,7-Trihydroxy-6-Fluorone                  2.80      463.46     2500.00
2-(4-(Fluorosulfonyl) Phenoxy) Acetic Acid         2500.00
Ibuprofen                                          2500.00     2500.00     2500.00
Cephaloglycin                                        73.36      112.16      162.01
hexyl-glutathione-Phenylglycine                       0.97     2500.00     2430.64
Octyl Glutathione                                     7.10      201.44      204.85
(S)-6-Methoxy-α-Methyl-2-Naphthaleneacetic Acid    2500.00     2500.00     2500.00
1,2,3,4-Tetrafluoro-5,8-Dihydroxy-Anthraquinone      10.40      528.78     2500.00
Ranitidine                                         2500.00     2500.00     2500.00
6-Chloro-3-Nitro-2H-Chromene                          0.10        0.44        0.17
Cholecalciferol                                      13.06     2500.00     2500.00
1,1'-Dibenzoylferrocene                              10.22       N.D.        N.D.
1,1'-Dibromosalicil                                   0.51      116.65      121.67
Dienestrol                                          124.08     2500.00     2500.00

In order to demonstrate that activity in cell extracts was not influenced by
contaminating bacterial proteins, one clone, Gb/204.3 (see below), was grown in larger
quantities, purified by S-hexyl glutathione affinity chromatography, and retested.
There were no significant differences between the binding profile of the partially-purified GP3
glubody preparation and that of the affinity-purified GP3, indicating that the binding
properties of the crude extracts were indeed the result of GP3.

A more sophisticated analysis of the binding data in Table 2 provided further
evidence of the significant changes in binding properties among the novel glubodies.
Pair-wise comparison of the natural GSTs from different classes reveals significant
overlap in their specificities, with correlation coefficients of 0.7 to 0.8. Comparing the
novel glubodies described in Table 2 with the recombinant GST-P1-1 protoglubody,
the highest correlation found was only 0.4. Moreover, particular glubodies such as
Gb/36-1 and Gb/36-2 are also different from each other.
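
The pair-wise comparison can be illustrated with a small calculation. The source does not specify the correlation statistic or any data transformation, so the sketch below makes two explicit assumptions: a Pearson coefficient, computed on log10-transformed values. It uses only the Table 2 rows in which all three proteins have a reported value, and the numbers it prints need not reproduce the coefficients quoted in the text.

```python
import math

# Table 2 rows with a value for all three proteins: (Native, Gb/36-1, Gb/36-2).
TABLE_2 = [
    (2500.00, 2500.00, 2500.00),  # Fluoresceinamine, Isomer II
    (3.30, 20.15, 23.84),         # Cibacron Brilliant Red 3BA
    (146.00, 613.17, 340.89),     # Fluorescein Isothiocyanate, Isomer I
    (2.80, 463.46, 2500.00),      # 9-Phenyl-2,3,7-Trihydroxy-6-Fluorone
    (2500.00, 2500.00, 2500.00),  # Ibuprofen
    (73.36, 112.16, 162.01),      # Cephaloglycin
    (0.97, 2500.00, 2430.64),     # hexyl-glutathione-Phenylglycine
    (7.10, 201.44, 204.85),       # Octyl Glutathione
    (2500.00, 2500.00, 2500.00),  # (S)-6-Methoxy-alpha-Methyl-2-Naphthaleneacetic Acid
    (10.40, 528.78, 2500.00),     # 1,2,3,4-Tetrafluoro-5,8-Dihydroxy-Anthraquinone
    (2500.00, 2500.00, 2500.00),  # Ranitidine
    (0.10, 0.44, 0.17),           # 6-Chloro-3-Nitro-2H-Chromene
    (13.06, 2500.00, 2500.00),    # Cholecalciferol
    (0.51, 116.65, 121.67),       # 1,1'-Dibromosalicil
    (124.08, 2500.00, 2500.00),   # Dienestrol
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_profile(column):
    """log10 of one protein's column of Table 2 values (an assumed transform)."""
    return [math.log10(row[column]) for row in TABLE_2]


NATIVE, GB36_1, GB36_2 = (log_profile(i) for i in range(3))

if __name__ == "__main__":
    print("Native  vs Gb/36-1:", round(pearson(NATIVE, GB36_1), 3))
    print("Native  vs Gb/36-2:", round(pearson(NATIVE, GB36_2), 3))
    print("Gb/36-1 vs Gb/36-2:", round(pearson(GB36_1, GB36_2), 3))
```

The result is sensitive to how the many entries at 2500.00 are handled, which is one reason a sketch like this should not be read as a reproduction of the reported analysis.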


Example 5

Synthesis of the "Gb/P204" Library of Glubodies
For loop mutagenesis of the C-terminal 204-210 loop (INGNGKQ), a primary
PCR reaction was performed as described above except that 10 pmol of the primer
GST-P1MODNotIrandom was used to replace GST-P1MODNotI in the amplification.
PCR products were purified and digested as above. The mutant cDNA generated from
this reaction was designated "Gb/P204" cDNA.

The members of this glubody family were produced in E. coli in a manner
analogous to that set forth for the Gb/P36 family as described in Examples 2 and 3.
The glubodies of the Gb/P204 family were tested for glutathione S-transferase
activity as measured by ability to couple CDNB as described in Example 4. This
catalytic reaction was performed in the presence of 18 potential inhibitors selected from
different chemical classes to obtain IC50 data for these proteins. Again, approximately
half of the glubodies retained this ability.

Illustrative members of this family and their IC50 values (-log IC50 (µM)) with
respect to various inhibitors are listed in Table 3. (N.D. = not determined.)

TABLE 3

-log IC50 Values (µM)

                                                   Native     Gb/204-1   Gb/204-2   Gb/204.3   Gb/204.5
Inhibitor                                          INGNGKQ    PEQHAP     HPDPPQA    MATGNR     GERRLE

Phloxine B                                            y.27     134.41     132.72      N.D.      103.76
Fluoresceinamine, Isomer II                        2415.24      N.D.       N.D.     2500.00      N.D.
Cibacron Brilliant Red 3BA                            1.10      59.40     119.79       4.11      37.54
Fluorescein Isothiocyanate, Isomer I               2500.00     352.88     314.00    2500.00     498.21
9-Phenyl-2,3,7-Trihydroxy-6-Fluorone                 37.97     564.67    2500.00    2500.00    2500.00
2-(4-(Fluorosulfonyl) Phenoxy) Acetic Acid         2500.00     897.83    2500.00    2500.00     167.42
Ibuprofen                                          2500.00    2500.00    2500.00    2500.00    2500.00
Cephaloglycin                                      1527.76     665.02    1052.42    2500.00     428.30
hexyl-glutathione-Phenylglycine                       3.57     331.60      N.D.       24.02     319.84
Octyl Glutathione                                    13.89    2500.00    2500.00      63.49     352.73
(S)-6-Methoxy-α-Methyl-2-Naphthaleneacetic Acid    2500.00    2500.00    2500.00    2500.00    2500.00
1,2,3,4-Tetrafluoro-5,8-Dihydroxy-Anthraquinone      61.35     113.18     196.56    2500.00    2500.00
Ranitidine                                         2500.00    2222.48    1758.29    2500.00    2122.09
6-Chloro-3-Nitro-2H-Chromene                          1.18    2500.00    2500.00       6.84    2500.00
Cholecalciferol                                    2500.00    2500.00    2500.00    2500.00    2500.00
1,1'-Dibenzoylferrocene                              69.92     257.93    1136.10    2500.00     227.23
1,1'-Dibromosalicil                                   2.66      N.D.       N.D.       N.D.       N.D.
Dienestrol                                          785.26    2500.00    2500.00     151.07    2500.00

As was done with respect to the Gb/P36 family, pair-wise comparison was
made with respect to the native GST and between individual members of the glubody
family Gb/P204. The results of these pair-wise correlations are shown in Figure 2. As
shown, the correlation in binding properties among the various glubodies and the
protoglubody varies from poor to nonsignificant. Similarly, the correlation among the
glubodies themselves varies substantially. In contrast, natural isozymes of this
protoglubody are known to be strongly correlated in binding properties despite primary
amino acid sequence differences far larger than among the glubodies. This indicates
that the protogludomain has a strong influence on binding properties.

Figure 3 summarizes the data of Tables 2 and 3 in "gray scale" form. In this
scale, black boxes represent the most potent compounds and white boxes represent no
detectable inhibition. P1-1 is the parental recombinant enzyme; recP1 is the
recombinant enzyme expressed with a C-terminal c-myc tag.
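
The exact mapping used to draw the gray-scale figures is not given. As one plausible sketch, an IC50 value can be mapped to a shade on a logarithmic scale, with black for the most potent compounds and white for no detectable inhibition. The 0.1 and 2500 limits below are simply the smallest and largest entries in Table 2 and are assumptions of this example, as is the function name.

```python
import math


def gray_level(ic50_um, most_potent=0.1, ceiling=2500.0):
    """Map an IC50 (µM) to a shade from 0 (black, most potent) to 255 (white).

    Values outside [most_potent, ceiling] are clipped; None (not determined)
    is passed through unchanged.
    """
    if ic50_um is None:
        return None
    clipped = min(max(ic50_um, most_potent), ceiling)
    span = math.log10(ceiling) - math.log10(most_potent)
    frac = (math.log10(clipped) - math.log10(most_potent)) / span
    return round(255 * frac)


if __name__ == "__main__":
    # A few native-enzyme entries from Table 2, for illustration.
    for name, value in [("6-Chloro-3-Nitro-2H-Chromene", 0.10),
                        ("Cibacron Brilliant Red 3BA", 3.30),
                        ("Dienestrol", 124.08),
                        ("Ibuprofen", 2500.00)]:
        print(f"{name:32s} {gray_level(value):3d}")
```

A logarithmic scale is chosen here because the tabulated values span more than four orders of magnitude; a linear scale would render nearly every potent compound as the same shade.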

Example 6

Synthesis of the "Gb/P206 Loop" Library of Glubodies
Another glubody library, the "Gb/P206L" library, was prepared by loop
mutagenesis of the human GST-P1-1 protoglubody, to generate insertions expanding
the size of the 204-210 loop. In this "expansion" loop mutagenesis, each glubody
receives a novel loop comprising a single hexapeptide insertion between residues 205
and 206. The insertion contained five random amino acid residues followed by a
proline residue. The resulting loops thus comprise the sequence
IN(XXXXXP)GNGKQ. The cDNA for this library was generated under the same
PCR amplification conditions described above except that the 3' oligonucleotide primer
was GST-Loop-Pi (see primers in Table 1).

Table 4 provides IC50 values (-log IC50 (µM)) for two members of the
Gb/P206L family (Gb/8 and Gb/12) and three additional members of the Gb/P204
family (Gb/19, Gb/21, and Gb/23). Gb/19 showed a frame shift subsequent to residue
204; therefore all 19 amino acids downstream of this residue (as opposed to 6) were
different from the native sequence. As might have been expected, this glubody shows
unresponsiveness to most inhibitors, and is, thus, a perhaps unintentional control.

The values in Table 4 represent the mean ± 1 S.D. of three separate assays. N.I.
= not inhibitory.

Table 4a shows correlation coefficients with respect to the glubodies assayed in
Table 4. Again, Gb/19 shows comparatively low values whereas the remaining
glubodies behave analogously to the native enzyme (P1-1) and to the recombinantly
prepared dimer (rP1-1).

Figure 4 shows the results obtained in Table 4 as the "gray scale" described
above.

Table 4a - Correlation Coefficients

          P1-1    rP1-1   Gb-8    Gb-12   Gb-19   Gb-21   Gb-23

P1-1      1.000
rP1-1     0.836   1.000
Gb-8      0.470   0.541   1.000
Gb-12     0.592   0.591   0.502   1.000
Gb-19     0.291   0.356   0.254   0.025   1.000
Gb-21     0.749   0.720   0.478   0.647   0.307   1.000
Gb-23     0.503   0.390   0.175   0.412   0.314   0.547   1.000
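
One simple way to summarize the matrix in Table 4a is to average each protein's correlation with all of the others. The sketch below does this from the lower-triangular values of the table; the averaging itself is an illustrative choice, not an analysis described in the source.

```python
NAMES = ["P1-1", "rP1-1", "Gb-8", "Gb-12", "Gb-19", "Gb-21", "Gb-23"]

# Lower triangle of Table 4a, one row per protein, diagonal included.
LOWER = [
    [1.000],
    [0.836, 1.000],
    [0.470, 0.541, 1.000],
    [0.592, 0.591, 0.502, 1.000],
    [0.291, 0.356, 0.254, 0.025, 1.000],
    [0.749, 0.720, 0.478, 0.647, 0.307, 1.000],
    [0.503, 0.390, 0.175, 0.412, 0.314, 0.547, 1.000],
]


def full_matrix(lower):
    """Expand a lower-triangular list of rows into a symmetric square matrix."""
    n = len(lower)
    return [[lower[max(i, j)][min(i, j)] for j in range(n)] for i in range(n)]


def mean_off_diagonal(matrix):
    """Mean correlation of each row with every other row (diagonal excluded)."""
    n = len(matrix)
    return [sum(row[j] for j in range(n) if j != i) / (n - 1)
            for i, row in enumerate(matrix)]


MEANS = dict(zip(NAMES, mean_off_diagonal(full_matrix(LOWER))))
LEAST_CORRELATED = min(MEANS, key=MEANS.get)

if __name__ == "__main__":
    for name, value in sorted(MEANS.items(), key=lambda item: item[1]):
        print(f"{name:6s} {value:.3f}")
```

On this summary the frame-shifted Gb-19 has the lowest mean correlation with the other proteins.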



Example 7

Synthesis of the "Gb/P36/204" Library of Glubodies
Using essentially the same techniques as described above, we created a glubody
family, designated the "Gb/P36/204" library or family, wherein loop mutagenesis was
conducted at both the 36-43 loop, as described in Example 2 in synthesizing the
Gb/P36 library, and the 204-210 loop, as described in Example 5 to prepare the
Gb/P204 family.


Example 8

Identification and Isolation of Glubodies Binding Particular Targets
Recombinant glubodies expressed on the surface of phagemid particles, as
described above, are panned against desired ligands in a procedure essentially as
described by Marks, J.D. et al. J Mol Biol (1991) 222:581-597.

Briefly, 96-well plates are coated with 5 µg streptavidin per well in 100 µl
coating buffer (0.1 M NaHCO3) overnight at 4°C. The streptavidin solution is
removed and replaced with blocking buffer (2% dry milk in PBS) for 30 min. at RT.
Wells are then washed 5 X with PBS/Tween (0.02%) followed by two washes with PBS
and overlaid with 1 µg biotinylated candidate targets in 100 µl PBS for one hour at
RT. Wells are washed as above and lO'^-lO^^ phagemid particles from rescued
libraries added to each well followed by incubation for two hours at RT. Unbound
phage are removed and the wells washed 10 X with PBS/Tween (0.02%) followed by two
washes with PBS.

Alternatively, 10^^ phagemids are preincubated with 1 ng - 1 biotinylated 
candidate target in solution for two hours at RT and then added to the 
streptavidin-coated wells for an additional 30 min. and washed as above.

Phagemids are then eluted with 100 µl 0.1 M HCl pH 2.2 containing 1 mg/ml
BSA for 10 min. at RT. Samples are immediately neutralized with 7 µl 2 M Tris base.
Eluted phagemids are amplified by infecting 2 ml TG-1 (OD600 = 0.8-1.0) with the
entire elution and allowing the culture to sit at RT for 15 min. Then 10 ml of
prewarmed (37°C) 2 X YT containing 40 µg/ml ampicillin is added and the culture is
grown at 37°C for one hour with shaking. The ampicillin concentration is adjusted to
100 µg/ml and the culture is incubated for 45 min. at 37°C in the shaker (250 rpm).
Shaking is then slowed to 100 rpm for 15 min. to allow pili to regenerate. VCSM13
helper phage (10^[exponent illegible] pfu) is added and the culture is incubated at
37°C for 15 min. without shaking. The culture is transferred to 200 ml 2 X YT
containing ampicillin at 100 µg/ml and incubated in the shaker for one hour at 37°C.
Finally, kanamycin is added to a final concentration of 50 µg/ml and bacteria are
grown overnight at 30°C, 225 rpm.
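
The volumes in the elution and amplification steps can be checked with a little
arithmetic, sketched below. The sketch treats the eluant as a 0.1 M strong acid
(ignoring the pH 2.2 adjustment and any buffering by BSA) and assumes a 100 mg/ml
ampicillin stock; the stock concentration is hypothetical and is not given in the
protocol.

```python
# Bench arithmetic for the elution and amplification steps (a sketch).

def base_excess_umol(acid_ul, acid_molar, base_ul, base_molar):
    """Micromoles of base remaining after neutralizing a strong acid.
    Volume in µl times molarity in mol/l gives µmol."""
    return base_ul * base_molar - acid_ul * acid_molar


def diluted_conc(conc_ug_ml, volume_ml, total_ml):
    """Concentration after a solution is diluted into a larger total volume."""
    return conc_ug_ml * volume_ml / total_ml


def stock_to_add_ul(current_ug_ml, target_ug_ml, culture_ml, stock_ug_ml):
    """µl of stock that raises a culture from current to target concentration.
    Solves (current * V + stock * x) / (V + x) = target for x."""
    x_ml = culture_ml * (target_ug_ml - current_ug_ml) / (stock_ug_ml - target_ug_ml)
    return x_ml * 1000.0


# Elution: 100 µl of 0.1 M HCl neutralized with 7 µl of 2 M Tris base.
excess = base_excess_umol(100, 0.1, 7, 2.0)

# Amplification: 2 ml TG-1 + 0.107 ml neutralized eluate + 10 ml 2 X YT (40 µg/ml).
culture_ml = 2.0 + 0.107 + 10.0
amp_now = diluted_conc(40.0, 10.0, culture_ml)

# ASSUMPTION: a 100 mg/ml (100,000 µg/ml) ampicillin stock; the text names none.
add_ul = stock_to_add_ul(amp_now, 100.0, culture_ml, 100_000.0)

print(f"Tris base excess: {excess:.1f} µmol")
print(f"ampicillin after mixing: {amp_now:.1f} µg/ml")
print(f"stock to add for 100 µg/ml: {add_ul:.1f} µl")
```

Under these assumptions the Tris base is in molar excess over the acid, which is
consistent with the eluate being neutralized before it is added to the cells.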

Binding to a target candidate can also be tested using ELISA assays. 96-well
plates are coated with 5 µg/well streptavidin in 100 µl 0.1 M NaHCO3 pH 9.2 in a
humidified chamber overnight at 4°C. The streptavidin solution is removed and
replaced with 100 µl blocking solution (1% BSA/PBS) and incubated for 30 min. at
RT. Wells are washed five times with PBS/Tween (0.02%) followed by two rinses with
PBS. One µg of biotinylated target candidate is added to each well in 100 µl PBS for
one hour at RT. Wells are washed as above and 10^[exponent illegible] phagemid
particles from individual recombinants, polyclonal amplified phagemid populations or
soluble glubody protein extract are added to the wells. Wells are washed as above and
100 µl of anti-M13 polyclonal IgG at a 1:1000 dilution or mAb 9E10 (Santa Cruz
Biotechnology, Inc., Santa Cruz, CA; anti-c-myc tag) at 1 µg/ml in 2% dry milk/PBS
is added for one hour at RT.

Plates are washed as above and secondary antibody [e.g., alkaline
phosphatase-conjugated goat anti-rabbit or goat anti-mouse antibody diluted 1:1000 in
2% milk/PBS] is added for one hour at RT. Wells are washed three times with
PBS/Tween (0.02%) followed by two rinses with dH2O. The wells are then developed
with 100 µl of 10 mM diethanolamine, 1 mM MgCl2 containing 1 mg/ml pNPP
(p-nitrophenyl phosphate) and read at 405 nm in an ELISA plate reader (Molecular
Devices, Palo Alto, CA).
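
One way the 405 nm readings might be turned into binder calls is sketched below. The
well names, absorbance values and the "blank mean plus three standard deviations"
cutoff are all hypothetical; the protocol above does not specify how a positive well
is scored.

```python
# Scoring ELISA wells from A405 readings (a sketch with made-up data).
from statistics import mean, stdev


def call_binders(a405, blank_wells, n_sd=3.0):
    """Return (net signals, positive wells).

    a405:        dict mapping well name -> absorbance at 405 nm
    blank_wells: names of wells that received no phagemid or glubody
    n_sd:        ASSUMED cutoff, in standard deviations above the blank mean
    """
    blanks = [a405[w] for w in blank_wells]
    background = mean(blanks)
    cutoff = background + n_sd * stdev(blanks)
    samples = {w: v for w, v in a405.items() if w not in blank_wells}
    net = {w: v - background for w, v in samples.items()}
    positives = sorted(w for w, v in samples.items() if v > cutoff)
    return net, positives


# Hypothetical plate: three blanks and three glubody clones.
readings = {
    "blank1": 0.05, "blank2": 0.06, "blank3": 0.07,
    "cloneA": 0.85, "cloneB": 0.08, "cloneC": 0.30,
}
net, positives = call_binders(readings, ["blank1", "blank2", "blank3"])
print(positives)
print({w: round(v, 2) for w, v in net.items()})
```

Subtracting the blank mean before comparing clones keeps plate-to-plate differences
in substrate development from being mistaken for differences in binding.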



Example 9 
Additional Protoglubodies 
Structures having the features of protogludomains (as described above) are not
very common, but they are relatively easy to detect using the methods disclosed
herein in proteins of a wide variety of different types, including proteins that
would not otherwise be expected to be useful for these applications.

Further analysis of potential candidates using molecular modelling resulted in
the identification of a number of different proteins likely to be useful for the
production of glubodies using the techniques described herein. Several illustrations
of these are described below.



A. Identification and Analysis of Gludomains in a Rat GST (GST 3:3) Protoglubody

The crystal structure of rat glutathione S-transferase ("GST 3:3") has been
determined in a complex with glutathione at 2.2 Å resolution. See Ji, X. et al.
Biochemistry (1992) 31:10169-10181. The rat GST 3:3 data were obtained from the
prerelease directory of the Brookhaven Protein Databank (Brookhaven code: 1GST).
Using the techniques described above, we were able to identify a potential
gludomain at residues 32-48.



B. Human DHFR 

Human dihydrofolate reductase (DHFR) exhibited at least two regions that are
likely protogludomains: Lys-18 to Leu-22 (the "18-22 loop"), and Phe-58 to Arg-65
(the "58-65 loop"). In particular, both of these domains exist as solvent-exposed
loops, and the limited hydrogen bonding engaged in by amino acids in these domains is
essentially restricted to amino acids within the loop. Thus, these solvent-exposed
loops also exhibit topological independence.

In contrast, amino acids immediately outside each of the loops engage in more
substantial interactions with other parts of the protein. Outside of the 18-22 loop,
for example, Gly-17 forms a hydrogen bond with Asp-145, and Pro-23 interacts with a
water molecule that in turn interacts with Ser-144.
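
The notion of topological independence used here can be sketched as a simple count
of hydrogen bonds that stay inside a loop versus those that reach the rest of the
protein. In the sketch below only the Gly-17/Asp-145 pair comes from the text; the
other residue pairs, the comparison window and the scoring function are hypothetical
and stand in for bonds that would be read from a real structure.

```python
# Classify hydrogen bonds relative to a candidate loop (a sketch).

def partition_hbonds(hbonds, loop_start, loop_end):
    """Split residue-residue H-bonds into (internal, external, unrelated).

    internal:  both partners lie inside the loop
    external:  exactly one partner lies inside the loop
    unrelated: neither partner lies inside the loop
    """
    def in_loop(residue):
        return loop_start <= residue <= loop_end

    internal, external, unrelated = [], [], []
    for a, b in hbonds:
        inside = in_loop(a) + in_loop(b)          # 0, 1 or 2 partners in the loop
        (unrelated, external, internal)[inside].append((a, b))
    return internal, external, unrelated


def independence(hbonds, loop_start, loop_end):
    """Fraction of the loop's H-bonds that stay within the loop (1.0 = independent)."""
    internal, external, _ = partition_hbonds(hbonds, loop_start, loop_end)
    total = len(internal) + len(external)
    return len(internal) / total if total else 1.0


# Gly-17/Asp-145 is taken from the text; the other pairs are HYPOTHETICAL
# placeholders standing in for bonds read from a real structure file.
hbonds = [(17, 145), (18, 21), (19, 22)]

print(independence(hbonds, 18, 22))   # candidate 18-22 loop
print(independence(hbonds, 14, 18))   # hypothetical comparison window
```

A score near 1.0 corresponds to the behaviour described for the 18-22 loop, while a
window that includes a flanking residue such as Gly-17 scores lower because its
bonds reach other parts of the protein.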

Additional evidence suggesting that these two loops will be useful as
protogludomains comes from crystallization studies. In particular, both the 18-22 loop
and the 58-65 loop are situated within 10 Å of the bound folate molecule in the
crystal structures determined by Davies II, J.F. et al. Biochemistry (1990)
29:9467, and are thus contiguous with a cavity in the protein that could act as a
binding site.
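
A proximity test of this kind can be sketched as a minimum-distance calculation
between loop atoms and ligand atoms. The coordinates below are invented toy values,
not taken from the DHFR structure, and the function names are hypothetical; only the
10 Å cutoff follows the text.

```python
# Minimum loop-to-ligand distance from atomic coordinates (a sketch).
import math


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance (Å) between any atom in A and any atom in B."""
    return min(math.dist(p, q) for p in atoms_a for q in atoms_b)


def loops_near_ligand(loops, ligand_atoms, cutoff=10.0):
    """Return {loop name: distance} for loops within `cutoff` Å of the ligand."""
    near = {}
    for name, coords in loops.items():
        d = min_distance(coords, ligand_atoms)
        if d <= cutoff:
            near[name] = d
    return near


# TOY coordinates in Å, chosen only to exercise the functions.
ligand = [(0.0, 0.0, 6.0)]
loops = {
    "loop near site": [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
    "distant loop": [(20.0, 0.0, 0.0)],
}

print(loops_near_ligand(loops, ligand))
```

In practice the coordinate lists would be read from the deposited structure file
rather than typed in.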

Other features also contribute to making human DHFR a preferred
protoglubody: the cDNA of human DHFR is available (Nienhuis, A.W. et al., J Biol
Chem (1984) 259:3933-43); the human recombinant protein has been expressed in
E. coli; and the protein is of a relatively convenient size (it is a dimer comprising
186 residues per monomer).



C. Human Retinol Binding Protein 

Retinol binding protein (RBP) is synthesized in hepatocytes, loaded with
retinol, and secreted into the plasma where it serves to deliver retinol to various
tissues and organs. Human RBP is a single-chain 21 kD protein and is a member of the
lipocalin superfamily. Lipocalins are involved in ligand transport and include, besides
RBP, β-lactoglobulin and bilin binding protein. Human RBP has been produced
recombinantly in the cytosol (Wang, et al. Gene (1993) 132:291-294); in the periplasm
of E. coli (Sivaprasadarao, et al. Biochem J (1993) 296:209-215); and in secreted form
from E. coli (our unpublished results). The secreted form has been shown to retain
retinol binding activity.

Human RBP has been characterized crystallographically by Cowan, S. et al.
Proteins (1990) 8:44-61, and a ribbon structure for this protein is shown in Figure 5.
The structure shows two candidate protogludomains in the regions Val61-Val69 and
Gly92-Gln98. Both are solvent-exposed loops that form part of the binding site.

D. E. coli Biotin Repressor

Another illustrative example of a protein exhibiting protogludomains is the
biotin repressor protein from E. coli. In particular, the region from Tyr-111 to
Arg-118 is a solvent-exposed loop in the form of a β-strand segment that engages in
few if any secondary structural contacts with other parts of the structure. The
111-118 loop lies on top of the biotin molecule.

The cDNA of the E. coli biotin repressor is available (Otsuka, et al., Gene
(1985) 35:321-331); the protein has 321 residues; and the crystal structure has
been solved (Wilson, K.P. et al. Proc Natl Acad Sci (1992) 89:9257).

E. Streptomyces Streptavidin

Another illustrative example of a protein exhibiting protogludomains is the
streptavidin protein from Streptomyces. The region from Gly-113 to Lys-121 is a
solvent-exposed loop that engages in few if any secondary structural contacts with
other parts of the structure. The 113-121 loop lies near the entrance of the cavity that
binds biotin. The residues in the loop are approximately 12 Å from the bound biotin
molecule. While this distance is larger than in the examples above, the loop does form
part of the cavity where the biotin molecule is bound, and alteration of the loop can be
expected to change the electrostatic properties within the cavity itself, which
would be predicted to alter the binding profiles of the resulting glubodies.

The cDNA of streptavidin derived from Streptomyces avidinii is available
(Cantor, et al. Nucleic Acids Res (1986) 14:1871-1882); the protein comprises 159
residues; and the crystal structure is available (Weber, P.C. et al. Science (1989)
243:85).






F. Human Cyclophilin

The cyclophilins are a family of highly conserved proteins that display
high-affinity binding to cyclosporin A, catalyze the cis/trans isomerization of a
peptide bond between proline and its N-terminal neighbor, and are thought to be
involved in the late stages of protein folding. Human cyclophilin has been produced
recombinantly in the cytosol of E. coli (Liu et al. Proc Natl Acad Sci USA (1990)
87:2304-2308) and a naturally occurring periplasmic E. coli cyclophilin homolog has
been isolated (Liu et al. ibid 4028-4032).

Human cyclophilin binds to peptides smaller than cyclosporin and has been
crystallized in a complex with a tetrapeptide (Kallen et al. Nature (1991)
353:276-279). The crystallographic structure has been determined by Ke, H. et al.
Proc Natl Acad Sci USA (1991) 88:7483. The region spanning Lys118-Lys125 has the
characteristics of a protogludomain in that it is solvent exposed, forms limited
interactions with noncontiguous residues and is part of the ligand binding site.

The ribbon structure of the rat counterpart of human cyclophilin is shown in
Figure 6. Rat and human isozymes have 96% conserved amino acid sequence, and the
rat cyclophilin cDNA is therefore used to create a glubody family in the
Lys118-Lys125 region in a manner analogous to that described above for the creation
of Gb/P36.
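
A conservation figure of this kind can be computed from a pairwise alignment, as
sketched below. The sketch counts identical residues, which is only one possible
reading of "conserved", and the two sequences are invented 25-residue strings rather
than cyclophilin sequences.

```python
# Percent identity between two aligned sequences (a sketch with toy data).

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap columns that hold the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]          # skip gap columns
    if not columns:
        raise ValueError("no aligned residues to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# TOY sequences: identical except for one substitution at the last position.
seq_1 = "ACDEFGHIKLMNPQRSTVWYACDEF"
seq_2 = "ACDEFGHIKLMNPQRSTVWYACDEY"

print(percent_identity(seq_1, seq_2))
```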

Claims

1. A method to prepare a family of member protein glubodies wherein said
family binds to or reacts with a variety of ligands, which method comprises
identifying a proto-gludomain in a proto-glubody protein; and
effecting in said proto-gludomain of each member of a multiplicity of molecules
of said proto-glubody protein, an alteration of the amino acid sequence, wherein said
alteration is different for each member.

2. The method of claim 1 wherein said alteration includes substitution of
one amino acid for another and/or deletion of one or more amino acids and/or insertion
of one or more amino acids.

3. The method of claim 1 wherein said alteration comprises substitutions
of 1-6 amino acid positions in said proto-gludomain randomized among said members,
or
wherein said alteration comprises substitutions of 1-6 amino acid positions in
said proto-gludomain designed to produce a maximally diverse multiplicity of
glubodies.

4. The method of claim 1 wherein said alteration is effected by
mutagenizing the nucleotide sequence encoding said proto-gludomain in each member
of said multiplicity of molecules of said proto-glubody protein.

5. A method to obtain a protein ligate glubody of desired binding
properties, which method comprises
identifying a proto-gludomain in a proto-glubody protein;
effecting in a proto-gludomain of each member of a multiplicity of molecules of
said proto-glubody protein, an alteration of amino acid sequence wherein said
alteration is different for each member; and
selecting from said multiplicity a protein ligate glubody of desired properties.

6. A multiplicity of protein glubodies, wherein said multiplicity binds to or
reacts with a variety of ligands,
wherein each said glubody has a modification in the protogludomain of a
protoglubody, and wherein each member of said multiplicity has a different altered
amino acid sequence in its proto-gludomain.

7. The multiplicity of claim 6 wherein said altered amino acid sequence
comprises substitution of one amino acid for another and/or deletion of one or more
amino acids and/or insertion of one or more amino acids.

8. The multiplicity of claim 6 wherein said altered amino acid sequence 
comprises substitutions of 1-6 amino acid positions in said proto-gludomain 
randomized among said members. 


9. The multiplicity of claim 6 which comprises a systematically diversified
panel of glubodies.

10. The multiplicity of claim 6 which is a family of glubodies wherein said
altered amino acid sequence is prepared by a process that comprises mutagenizing the
nucleotide sequence encoding said proto-gludomain in each member of said
multiplicity of molecules of a single protoglubody protein.

11. The multiplicity of claim 10 wherein said protoglubody is selected from
the group consisting of a glutathione S-transferase, a retinol binding protein, a
cyclophilin, a dihydrofolate reductase, a ferredoxin, a biotin repressor, a streptavidin
protein and a ricin protein.

12. The multiplicity of claim 11 wherein said protoglubody is a
glutathione-S-transferase or a retinol binding protein.

13. A composition of DNA molecules which encodes the multiplicity of

14. A composition of DNA molecules which encodes the multiplicity of
claim 10.

15. A composition of DNA which comprises expression systems effective in
producing the multiplicity of claim 6.

16. A composition of DNA which comprises expression systems effective in
producing the multiplicity of claim 10.

17. A method to characterize a single analyte, which method comprises:
contacting said analyte with each member of a panel of glubodies, said
glubodies having characteristics of the multiplicity of claim 6;
detecting the degree of reactivity of said analyte to each of said glubodies;
recording said degree of reactivity of said analyte to each of said glubodies; and
arranging said recorded degrees of reactivity so as to provide a characteristic
profile of said analyte.

18. The method of claim 17, wherein said detecting is by reacting
unlabelled analyte competitively with a diverse mixture of labeled mimotopes with
respect to each of said glubodies, which mixture is approximately equally reactive with
each glubody in said panel and measuring the reduction in binding of the labeled
mixture to each glubody in the panel.

19. The method of claim 18 wherein said glubodies are coupled to a solid
support in a predetermined pattern.

20. A method to identify a candidate, which candidate will be effective in
reacting with a target, wherein said target has a known ligand with which it reacts,
which method comprises:
contacting said candidate with each of a panel of glubodies, which glubodies
react in a multiplicity of differing degrees with said candidate;
detecting the degree of reactivity of said candidate to each of said glubodies;
recording each said degree of reactivity of said candidate to each of said
glubodies;
arranging said recorded degrees of reactivity so as to provide a characteristic
profile of said candidate;
comparing said profile to a profile analogously obtained of said ligand with
respect to said multiplicity of glubodies;
wherein similarity of the profile of said candidate to the profile of said ligand
indicates the ability of the candidate to react with said target.

21. The method of claim 20 wherein said target is a receptor and the
candidate is a candidate drug.

22. The method of claim 20 wherein said glubodies are coupled to a solid
support in a predetermined pattern.



23. A method to identify a candidate reactive with a target, which method
comprises:
(a) providing a formula that represents a combination of the reactivity
profiles with respect to a first set of candidates of at least two members of a panel
comprising the multiplicity of glubodies of claim 6, which formula calculates a
predicted profile that best matches the reactivity profile of the target with respect to
said first set of candidates;
(b) testing the reactivity of said at least two members of the panel with
respect to a candidate; and
(c) calculating a predicted reactivity with respect to the target for said
candidate by applying said formula to the reactivities determined in step (b) to estimate
the reactivity of the candidate with respect to the target; and identifying a substance as
being a candidate predicted to react with the target.

24. The method of claim 23 which further includes the step of assembling
the identified substance from starting materials appropriate to said substance.



25. A method to modulate the metabolism of a cell which method
comprises culturing said cell, which has been modified to contain an expression system
effective in producing a glubody identified by the method of claim 5 under conditions
wherein said glubody is produced intracellularly so as to effect an interaction between
said glubody and intracellular components to modulate said metabolism.
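
The profile comparison of claim 20 and the formula of claim 23 can be illustrated
with the following sketch. It is not part of the claims. It assumes that profile
similarity is scored by Pearson correlation and that the "formula" is a two-member
linear combination fitted by least squares; both choices, all function names and all
reactivity values are hypothetical.

```python
# Reactivity-profile comparison and a fitted two-member formula (a sketch).
import math


def pearson(x, y):
    """Pearson correlation between two equal-length reactivity profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_candidates(ligand_profile, candidate_profiles):
    """Candidates sorted by similarity of their profile to the known ligand's."""
    scores = {name: pearson(profile, ligand_profile)
              for name, profile in candidate_profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def fit_formula(glubody_1, glubody_2, target):
    """Least-squares weights (w1, w2) with w1*glubody_1 + w2*glubody_2 ~ target,
    each argument listing reactivities toward the same first set of candidates."""
    a11 = sum(u * u for u in glubody_1)
    a12 = sum(u * v for u, v in zip(glubody_1, glubody_2))
    a22 = sum(v * v for v in glubody_2)
    b1 = sum(u * t for u, t in zip(glubody_1, target))
    b2 = sum(v * t for v, t in zip(glubody_2, target))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("the two glubody profiles are linearly dependent")
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det


def predict(weights, reactivity_1, reactivity_2):
    """Predicted target reactivity of a new candidate from two panel readings."""
    return weights[0] * reactivity_1 + weights[1] * reactivity_2


# HYPOTHETICAL reactivities across a four-glubody panel.
ligand = [0.9, 0.1, 0.5, 0.3]
candidates = {"A": [0.8, 0.2, 0.6, 0.2], "B": [0.1, 0.9, 0.3, 0.7]}
ranking = rank_candidates(ligand, candidates)
print(ranking[0][0])

# HYPOTHETICAL training data: two panel members and the target, four candidates.
g1, g2 = [1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]
target = [0.5, 2.0, 3.0, 6.5]
weights = fit_formula(g1, g2, target)
print(weights, predict(weights, 2.0, 1.0))
```

With more than two panel members the same idea extends to an ordinary multiple
regression.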